Efficient Synchronization on Multiprocessors with Shared Memory by
نویسندگان
چکیده
A new formalism is given for read-modify-write (RMW) synchronization operations. This formalism is used to extend the memory reference combining mechanism, introduced in the NYU Ultracomputer, to arbitrary RMW operations. A formal correctness proof of this combining mechanism is given. General requirements for the practicality of combining are discussed. Combining is shown to be practical for many useful memory access operations. This includes memory updates of the form mem_val := mem_val op val, where op need not be associative, and a variety of synchronization primitives. The computation involved is shown to be closely related to parallel prefix evaluation.
منابع مشابه
A New Synchronization Scheme for Memory Consistency Model ( Extended Abstract )
Modernistic scalable multiprocessors are mostly built with a distributed-shared memory architecture. Large scale shared memory multiprocessors have long memory latencies for the remote memory access. And these latencies can quickly offset system performance earned from the exploitation of parallelism. In order to improve system performance, we must reduce memory latencies. The useful way for th...
متن کاملSystem Software Support for Reducing Memory Latency on Distributed Shared Memory Multiprocessors
This paper overviews results from our recent work on building customized system software support for Distributed Shared Memory Multiprocessors. The mechanisms and policies outlined in this paper are connected with a single conceptual thread: they all attempt to reduce the memory latency of parallel programs by optimizing critical system services, while hiding the complex architectural details o...
متن کاملAutomatic Partitioning of Data and Computations on Scalable Shared Memory Multiprocessors
This paper describes an algorithm for deriving data and computation partitions on scalable shared memory multiprocessors. The algorithm establishes affinity relationships between where computations are performed and where data is located based on array accesses in the program. The algorithm then uses these affinity relationships to determine both static and dynamic partitions for arrays and par...
متن کاملFast synchronization on shared-memory multiprocessors: An architectural approach
Synchronization is a crucial operation in many parallel applications. Conventional synchronization mechanisms are failing to keep up with the increasing demand for efficient synchronization operations as systems grow larger and network latency increases. The contributions of this paper are threefold. First, we revisit some representative synchronization algorithms in light of recent architectur...
متن کامل